Exploring metadata of the human vaginal microbiome

Group 6

Alberte Englund
Mathilde Due
Line Winther Gormsen
Sigrid Frandsen
Kristine Johansen

Examples:

Just a reminder of the things we can use in the presentation document. Here I write in bold. Here I write in italics. Here I write in bold and italics. Here I write in red.

Study Description

  • Dataset: Genome metadata from the human vaginal microbiome.
  • Aim: Uncover patterns in genome quality, taxonomic composition, and ecological characteristics, while demonstrating the principles of reproducible and collaborative data science in R.

Pause

content before the pause

. . .

content after the pause

Examples:

Several columns

Left column

Right column

Examples: smaller text on slide

Use a slide like this to make the text smaller if the content does not fit.

  • Bullet Point 1

  • Bullet Point 2

  • Bullet Point 3

  • Bullet Point 4

  • Bullet Point 5

  • Bullet Point 6

Examples: scrollable slide

Use a slide like this so you can scroll to see the rest of the content if there is too much.

  • Bullet Point 1

  • Bullet Point 2

  • Bullet Point 3

  • Bullet Point 4

  • Bullet Point 5

  • Bullet Point 6

  • Bullet Point 7

  • Bullet Point 8

  • Bullet Point 9

  • Bullet Point 10

  • Bullet Point 11

  • Bullet Point 12

  • Bullet Point 13

Asides and footnotes

Asides

Slide content

Footnotes

  • Green 1
  • Brown
  • Purple

Study Description

Metadata from MGnify’s vaginal microbiome genome catalogue

  • Uncover patterns in genome quality, taxonomic composition, and ecological characteristics.

  • Uncover potential patterns for diagnosis of endometriosis via associated pathogens:

    • Anaerococcus, Ureaplasma, Gardnerella, Veillonella, Corynebacterium, Peptoniphilus, Candida albicans, Alloscardovia

From untidy to tidy data

  1. Each variable is saved in its own column.
  2. Each observation is saved in its own row.
  3. Each type of observational unit is stored in its own table.

# A tibble: 618 × 20
   Genome        Genome_type  Length N_contigs    N50 GC_content Completeness
   <chr>         <chr>         <dbl>     <dbl>  <dbl>      <dbl>        <dbl>
 1 MGYG000303700 MAG          678213         2 466332       47.8         63.7
 2 MGYG000303701 MAG         1500176        18 112881       42.4         87.8
 3 MGYG000303702 MAG         1210062        44  48790       26.4         94.8
 4 MGYG000303703 MAG         1706016        27  89653       44.6         93.7
 5 MGYG000303704 MAG          703182         7 111709       47.8         63.7
 6 MGYG000303705 MAG         2542045       112  34925       48           97.9
 7 MGYG000303706 MAG         1449687       185  10153       34.8         85.2
 8 MGYG000303707 MAG         1874692        90  28768       37.1         99.0
 9 MGYG000303708 MAG         1480380        12 169949       42.2         87.6
10 MGYG000303709 MAG          694644        57  15063       47.9         62.0
# ℹ 608 more rows
# ℹ 13 more variables: Contamination <dbl>, rRNA_5S <dbl>, rRNA_16S <dbl>,
#   rRNA_23S <dbl>, tRNAs <dbl>, Genome_accession <chr>, Species_rep <chr>,
#   Lineage <chr>, Sample_accession <chr>, Study_accession <chr>,
#   Country <chr>, Continent <chr>, FTP_download <chr>
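
One tidying step suggested by the tables in this deck is splitting the single Lineage column into one column per taxonomic rank, since the cleaned table lists Domain through Species separately. The sketch below is a minimal illustration, not the group's actual script. It assumes that Lineage holds semicolon-separated entries with rank prefixes such as "d__" and "p__"; the genome IDs and lineage strings are toy values made up for the example.

```r
library(tibble)
library(tidyr)
library(dplyr)

# Toy rows standing in for the catalogue metadata (hypothetical IDs).
# Assumption: Lineage is "d__...;p__...;c__...;o__...;f__...;g__...;s__...".
genomes <- tibble(
  Genome  = c("G1", "G2"),
  Lineage = c(
    "d__Bacteria;p__Actinobacteriota;c__Actinomycetia;o__Actinomycetales;f__Bifidobacteriaceae;g__Gardnerella;s__Gardnerella vaginalis",
    "d__Bacteria;p__Firmicutes;c__Bacilli;o__Lactobacillales;f__Lactobacillaceae;g__Lactobacillus;s__Lactobacillus crispatus"
  )
)

ranks <- c("Domain", "Phylum", "Class", "Order", "Family", "Genus", "Species")

tidy_genomes <- genomes %>%
  # One column per rank, so each variable gets its own column.
  separate(Lineage, into = ranks, sep = ";") %>%
  # Drop the rank prefix ("d__", "p__", ...) from every rank column.
  mutate(across(all_of(ranks), function(x) sub("^[a-z]__", "", x)))

print(tidy_genomes)
```

After this step each rank can be filtered or grouped on directly, for example by Genus or Phylum.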

About the data set / project

Metadata from MGnify’s vaginal microbiome genome catalogue

  • Something about our data
  • Some distributions, and possibly plots, to give an overview of our data.

Cleaning of data

# A tibble: 618 × 25
   Genome        Genome_type  Length N_contigs    N50 GC_content Completeness
   <chr>         <chr>         <dbl>     <dbl>  <dbl>      <dbl>        <dbl>
 1 MGYG000303700 MAG          678213         2 466332       47.8         63.7
 2 MGYG000303701 MAG         1500176        18 112881       42.4         87.8
 3 MGYG000303702 MAG         1210062        44  48790       26.4         94.8
 4 MGYG000303703 MAG         1706016        27  89653       44.6         93.7
 5 MGYG000303704 MAG          703182         7 111709       47.8         63.7
 6 MGYG000303705 MAG         2542045       112  34925       48           97.9
 7 MGYG000303706 MAG         1449687       185  10153       34.8         85.2
 8 MGYG000303707 MAG         1874692        90  28768       37.1         99.0
 9 MGYG000303708 MAG         1480380        12 169949       42.2         87.6
10 MGYG000303709 MAG          694644        57  15063       47.9         62.0
# ℹ 608 more rows
# ℹ 18 more variables: Contamination <dbl>, rRNA_5S <dbl>, rRNA_16S <dbl>,
#   rRNA_23S <dbl>, tRNAs <dbl>, Country <chr>, Continent <chr>, Domain <chr>,
#   Phylum <chr>, Class <chr>, Order <chr>, Family <chr>, Genus <chr>,
#   Species <chr>, Completeness_quality <chr>, Contamination_quality <chr>,
#   Overall_quality <chr>, endometriosis_associated <lgl>
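
The cleaned table adds quality labels and an endometriosis_associated flag. The sketch below shows one way such columns could be derived with dplyr. The cut-offs (90 % and 50 % completeness, 5 % and 10 % contamination), the labels "high", "medium" and "low", and the toy rows are assumptions for illustration only; the slides do not state the rules actually used. The genus list comes from the Study Description slide, with Candida albicans matched at genus level here.

```r
library(tibble)
library(dplyr)

# Genera taken from the Study Description slide.
endo_genera <- c("Anaerococcus", "Ureaplasma", "Gardnerella", "Veillonella",
                 "Corynebacterium", "Peptoniphilus", "Candida", "Alloscardovia")

# Toy rows with made-up values (hypothetical IDs).
toy <- tibble(
  Genome        = c("G1", "G2", "G3"),
  Genus         = c("Gardnerella", "Lactobacillus", "Veillonella"),
  Completeness  = c(97.9, 63.7, 87.8),
  Contamination = c(0.5, 6.2, 1.1)
)

flagged <- toy %>%
  mutate(
    # Assumed thresholds, not confirmed by the slides.
    Completeness_quality = case_when(
      Completeness >= 90 ~ "high",
      Completeness >= 50 ~ "medium",
      TRUE               ~ "low"
    ),
    Contamination_quality = case_when(
      Contamination < 5  ~ "high",
      Contamination < 10 ~ "medium",
      TRUE               ~ "low"
    ),
    # Overall label: high only if both are high, low if either is low.
    Overall_quality = case_when(
      Completeness_quality == "high" & Contamination_quality == "high" ~ "high",
      Completeness_quality == "low" | Contamination_quality == "low"   ~ "low",
      TRUE                                                             ~ "medium"
    ),
    # Logical flag: does the genus appear in the listed genera?
    endometriosis_associated = Genus %in% endo_genera
  )

print(flagged)
```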

Analysis 1

Analysis 2

Analysis 3 – Are endometriosis-associated MAGs genomically distinct?

Do endometriosis-associated MAGs differ genomically from non-associated MAGs?

Data & features compared

  • GC content
  • Genome length
  • Completeness & contamination
  • Phylum distribution

Key findings

  • Endometriosis-associated MAGs occur mainly in a few major phyla
  • GC content distributions overlap almost completely between groups
  • Genome sizes show no clear separation
  • Assembly quality is similarly high in both groups

Conclusion

  • No clear genomic differences between endometriosis-associated and non-associated MAGs
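
A comparison like the one summarised above can be sketched as a grouped summary plus a two-sample test. This is a minimal illustration on made-up toy values, not the analysis behind the findings. The Wilcoxon rank-sum test is one reasonable choice swapped in here; the slides do not say which test, if any, was used.

```r
library(tibble)
library(dplyr)

# Toy data: values and group labels are invented for the example.
toy <- tibble(
  endometriosis_associated = c(TRUE, TRUE, TRUE, FALSE, FALSE, FALSE),
  GC_content = c(42.4, 47.8, 37.1, 44.6, 48.0, 34.8),
  Length     = c(1500176, 678213, 1874692, 1706016, 2542045, 1449687)
)

# Per-group sample size and medians for the compared features.
group_summary <- toy %>%
  group_by(endometriosis_associated) %>%
  summarise(
    n             = n(),
    median_gc     = median(GC_content),
    median_length = median(Length),
    .groups       = "drop"
  )

# Non-parametric test of whether GC content differs between the groups.
gc_test <- wilcox.test(GC_content ~ endometriosis_associated, data = toy)

print(group_summary)
print(gc_test$p.value)
```

The same pattern extends to Length, Completeness and Contamination by swapping the response variable in the formula.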

Analysis 4

Results

Description of data

Analysis 1

Possibly some plots

Analysis 2

Possibly some plots

Analysis 3

Possibly some plots

Analysis 4

Possibly some plots

Analysis 5

Possibly some plots

Discussion

Any discussion points we have go here

E.g. something we should have done differently

or gaps/limitations in the dataset

Future Perspectives

What the results can be used for

What else would have been nice to do.